Avoid shell invocation for local log tailing#2580
Conversation
|
hey! not sure what the purpose of this pr is, can you explain the current painpoint? |
|
Thanks for asking. The pain point is local launcher robustness around log paths. Both local entrypoints currently build shell strings for log streaming ( This PR keeps the training/orchestration processes unchanged. It only starts the log tailing pipeline with argv lists so the log path is passed literally, and the tests cover a path containing shell metacharacters plus the torchrun-prefix stripping pipeline. If this defensive hardening is not worth the extra helper in this codebase, I can close or reduce the patch. |
|
@resolvicomai if we can remove the tests form the pr i'd be okay with merging this |
Summary
Testing
Note: I used --no-project on macOS because the checked-in uv lockfile only supports Linux environments.
Note
Low Risk
Low risk: only changes how local launcher log streaming spawns
tail/sed, reducing shell-injection exposure while keeping core training/orchestration execution unchanged.Overview
Updates local
rlandsftlaunchers to stream logs via a newstart_tail_processes()helper instead ofPopen(..., shell=True).Adds
start_tail_processes()inutils/process.pyto runtail -F(optionally piping throughsedto strip torchrun rank prefixes) using argv lists, and includes unit tests covering both modes and log paths containing shell metacharacters.Reviewed by Cursor Bugbot for commit ddf3b39. Bugbot is set up for automated code reviews on this repo. Configure here.